Abstract

Background: Experience from progeny-testing indicates that the mating of popular bull sires that have high 
estimated breeding values with excellent dams does not guarantee the production of offspring with superior 
breeding values. This is explained partly by differences in the standard deviation of gamete breeding values 
(SDGBV) between animals at the haplotype level. The SDGBV depends on the variance of the true effects of 
single nucleotide polymorphisms (SNPs) and the degree of heterozygosity. Haplotypes of 58 035 Holstein animals 
were used to predict and investigate expected SDGBV for fat yield, protein yield, somatic cell score and the direct 
genetic effect for stillbirth. 

Results: Differences in SDGBV between animals were detected, which means that the groups of offspring of 
parents with low SDGBV will be more homogeneous than those of parents with high SDGBV, although the 
expected mean breeding values of the progeny will be the same. SDGBV was negatively correlated with 
genomic and pedigree inbreeding coefficients and a small loss of SDGBV over time was observed. Sires that 
had relatively low mean gamete breeding values but high SDGBV had a higher probability of producing 
extremely positive offspring than sires that had a high mean gamete breeding value and low SDGBV. 

Conclusions: An animal's SDGBV can be estimated based on genomic information and used to design 
specific genomic mating plans. Estimated SDGBV are an additional tool for mating programs, which allows 
breeders to identify and match mating partners using specific haplotype information. 



Background 

In recent years, dairy cattle breeding schemes have
changed drastically with the routine availability of dense
single nucleotide polymorphism (SNP) chips. Initially,
research focused mainly on estimation of genomic breed- 
ing values [1-3] and more recently, on imputation from 
low-density marker sets to denser marker sets [4-6]. In 
addition to genomic breeding values, other information 
can also be derived from dense marker information, such 
as parentage verification [7]. In addition, VanRaden et al. 
[8] identified haplotypes with genetic lethal effects that 
may lead to embryonic death in the homozygous state. 
Moreover, genetic characteristics such as horn status [9] 
can be predicted with routine SNP information. 
In addition, genotyping large numbers of animals and
dense SNP datasets makes it possible to characterize gen- 
etic variation at the chromosome and haplotype levels 
[10,11]. Consequently, SNP haplotype information can be 
used to estimate the expected variance of breeding values 
at the gamete level. Variation between gametes is gener- 
ated by random sampling of parental haplotypes during 
meiosis [11] if the dam and/or the sire are heterozygous. 

Knowledge of the mean (MGBV) and standard deviation
(SDGBV) of gamete breeding values, assuming normally
distributed estimated breeding values, allows the
development of specific mating plans. For example, the
probability that the breeding value of an offspring exceeds 
a certain threshold can be estimated. In addition, it is pos- 
sible to predict the number of animals to be tested to pro- 
duce an offspring with an estimated breeding value above 
a given threshold. Cole and VanRaden [11] discussed the 
possibility of selecting animals for which gamete breeding 
values vary little, in order to produce more homogeneous 
progeny and simplify herd management. Conversely, 
breeding companies may be more interested in heterogeneous
progeny to increase the probability of extremely
positive offspring. In line with this, experience with 
progeny-testing indicates that the use of popular sires with 
high estimated breeding values and many tested offspring 
does not guarantee that male offspring with superior 
breeding values are produced. In contrast, bulls for which 
fewer male offspring are tested sometimes produce more 
excellent offspring than popular bulls. 

The objective of this study was to predict and investi- 
gate the expected SDGBV using genomic information 
and to demonstrate its usefulness to improve mating 
decisions. 

Methods 

Data 

A total of 58 035 Holstein animals genotyped with 
the Illumina BovineSNP50 BeadChip (Illumina Inc., 
San Diego, CA, USA) obtained from routine genomic 
evaluation for German Holsteins [3] (February 2013) 
were chosen for the study. Of the 50 k SNPs on this chip, 
43 586 autosomal SNPs that had a minor allele frequency 
greater than 1% were selected. The algorithm reported by 
Hayes [12] was used to check whether genotype informa- 
tion agreed with the pedigree information. Only genotypes 
with a call rate greater than 98% were used. The software 
package Beagle (version 3.3, [13]) with default settings was 
used for imputation of missing marker genotypes and for 
phasing the genotypes. For this purpose, Beagle uses link- 
age disequilibrium at the population level. The order of 
the SNPs on the chromosomes was based on the UMD3.1 
bovine genome assembly [14]. 

Four traits (fat yield, protein yield, somatic cell score 
and the direct genetic effect for stillbirth) with different 
genetic architectures, heritabilities and genomic reliabil- 
ities were chosen. SNP effects were estimated with a 
BLUP model assuming trait-specific residual polygenic 
variance (for more details on the model see [3]). 

Pedigree and genomic relationships 

The pedigree contained 58 035 genotyped animals (15 816 
females and 42 219 males) and their 136 477 ancestors. 
All sires and dams of the genotyped animals were known. 
The animals were born between 1960 and 2013 and were 
descendants from 2768 different sires and 32 416 different 
dams. Genomic inbreeding coefficients were calculated by 
setting up the diagonal elements of the genomic relation- 
ship matrix, as suggested by VanRaden [15]. Allele fre- 
quencies in the base population were estimated using the 
gene content method described by Gengler et al. [16]. 
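
As an illustration of the diagonal-element approach, the following minimal sketch computes a genomic inbreeding coefficient as G_ii - 1 for each animal. It assumes VanRaden's first method (genotypes coded 0/1/2, centred by twice the base allele frequency and scaled by 2Σp(1-p)); which of VanRaden's variants was actually used is not stated here. The function name and the toy data are hypothetical, and base allele frequencies are simply passed in rather than estimated with the gene content method.

```python
def genomic_inbreeding(genotypes, base_freqs):
    """Return F_G = G_ii - 1 per animal (assumed: VanRaden's first method).

    genotypes:  one list per animal with allele counts coded 0, 1 or 2
    base_freqs: base-population frequency of the counted allele per SNP
    """
    # Scaling factor 2 * sum(p * (1 - p)) over all SNPs
    scale = 2.0 * sum(p * (1.0 - p) for p in base_freqs)
    coefficients = []
    for animal in genotypes:
        # Diagonal element of G: squared centred genotypes, scaled
        g_ii = sum((m - 2.0 * p) ** 2 for m, p in zip(animal, base_freqs)) / scale
        coefficients.append(g_ii - 1.0)
    return coefficients


# Hypothetical toy data: four SNPs, base frequency 0.5 each
toy_genotypes = [
    [0, 2, 2, 0],  # homozygous at every SNP
    [1, 1, 1, 1],  # heterozygous at every SNP
]
toy_freqs = [0.5, 0.5, 0.5, 0.5]
print(genomic_inbreeding(toy_genotypes, toy_freqs))
```

With equal base frequencies, the fully homozygous toy animal receives a higher coefficient than the fully heterozygous one, which is the behaviour that links F_G to the degree of heterozygosity and hence to SDGBV.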

Flow of information 

A scheme of the flow of information through the different
steps of the estimation of MGBV and SDGBV is shown in
Figure 1. First, the software package Beagle was used
to phase the SNP genotypes and construct haplotypes.
The haplotypes, the SNP effects and, to define haplotype
size, a map of recombination events were then used to
estimate haplotype-specific breeding values (program
hapDGV.f90). These results were the inputs for
estimating MGBV and SDGBV (program genvar.f90).
The resulting data, together with the pedigree and animal
ownership information, were then used by the mating software.

Prediction of mean and standard deviation of gamete 
breeding values 

MGBV and SDGBV were obtained by sampling different
sets of transmitted haplotypes from the animals. In theory,
with 29 autosomal chromosomes and ignoring the sex
chromosome, there are 2^29 possible combinations of sampled
haplotypes if the length of a haplotype is defined as
one autosome and recombination is ignored. Assuming
that, on average, one recombination occurs per centiMorgan,
there is a near unlimited number of possible
combinations of haplotypes. Thus, to make the simulation
computationally feasible and to reduce the number of
haplotype combinations, the genome was divided into
1856 chromosome segments (C) according to positions in
the genome where a high number of recombination events
occurred. These recombination events were identified in a
preliminary study (results not shown here) in which a
whole genome map of the number of crossing-over events
was derived by identifying phase switches between the
haplotypes of the sires and the paternal haplotypes of their
sons.

Figure 1 Flow of data and programs used to estimate MGBV
and SDGBV.

In the first step of the simulation of the SDGBV within
an animal (program hapDGV.f90), the paternal and maternal
haplotype breeding values for each animal were
calculated as:

$$h_{ij} = \sum_{k=1}^{n} z_{kj} a_k,$$

where $h_{ij}$ is the breeding value of the $i$th haplotype,
with $j$ the indicator of maternal or paternal haplotype,
$z_{kj}$ is the maternal or paternal allele of marker $k$,
$a_k$ is half of the estimated effect of the $k$th
SNP from routine genomic evaluation of German Holstein
cattle [3], and $n$ is the number of SNPs belonging to the
$i$th haplotype. Imprinting, dominance and epistasis were
not considered in the simulation. In the second step, using 
the program genvar.f90, 100 000 possible gametes were 
simulated by selecting either the maternal or paternal 
phase from an animal. At the beginning of the chromo- 
some, the probability of selecting the maternal or paternal 
strand was equal to 50%. Location of cross-overs was 
implemented in the simulation based on a uniform distri- 
bution over the interval [0,C] (C being the number of 
chromosome segments). The mean recombination rate 
between the haplotype strands was set to 0.3, which is in 
line with the number of expected recombinations assum- 
ing one recombination per Morgan. 
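The MGBV of a parent was calculated as:

$$\mathrm{MGBV} = \frac{1}{N} \sum_{N} \sum_{H} h_{ij},$$

where $N$ is the number of replicates of the simulation, $H$
is the number of haplotypes, and $h_{ij}$ is the $i$th paternal or
maternal haplotype breeding value.
The SDGBV of a parent was calculated as the standard
deviation of the simulated gamete breeding values over the
$N$ replicates.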

Correlations between traits were analyzed for MGBV 
and SDGBV to investigate relationships between traits. 
To study whether selection, which should result in in- 
creased inbreeding and homozygosity per generation, 
had an antagonistic effect on MGBV and SDGBV, correlations
of SDGBV and MGBV with the genomic (F_G)
and the pedigree (F_P) inbreeding coefficients were computed
for each trait. Furthermore, MGBV and SDGBV
were tested for normality. 
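
The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' hapDGV.f90 or genvar.f90 programs: haplotype breeding values per chromosome segment are taken as given, the number of crossovers per chromosome is drawn from a Poisson distribution with a free, hypothetical parameter `mean_crossovers` (how the reported mean recombination rate of 0.3 maps onto this is not specified in the text), a crossover is applied at the next segment boundary, and the population standard deviation is used for SDGBV. All function names and toy values are made up.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_gamete(chromosomes, rng, mean_crossovers):
    """Return the breeding value of one simulated gamete.

    chromosomes: list of (paternal, maternal) pairs; each member of a pair
    is a list with one haplotype breeding value per chromosome segment.
    """
    total = 0.0
    for paternal, maternal in chromosomes:
        strands = (paternal, maternal)
        n_segments = len(paternal)
        strand = rng.randrange(2)  # either strand with probability 50%
        # Crossover locations drawn uniformly over [0, C]
        crossovers = sorted(rng.uniform(0, n_segments)
                            for _ in range(poisson(rng, mean_crossovers)))
        pending = 0
        for segment in range(n_segments):
            # Assumption: a crossover takes effect at the next segment boundary
            while pending < len(crossovers) and crossovers[pending] <= segment:
                strand = 1 - strand
                pending += 1
            total += strands[strand][segment]
    return total


def gamete_mean_and_sd(chromosomes, n_gametes=100_000,
                       mean_crossovers=1.0, seed=1):
    """Return (MGBV, SDGBV) estimated from n_gametes simulated gametes."""
    rng = random.Random(seed)
    values = [simulate_gamete(chromosomes, rng, mean_crossovers)
              for _ in range(n_gametes)]
    return statistics.fmean(values), statistics.pstdev(values)


# Hypothetical toy animal: two chromosomes with three segments each
toy_animal = [
    ([0.2, -0.1, 0.3], [0.0, 0.1, -0.2]),
    ([0.1, 0.1, 0.1], [0.1, 0.1, 0.1]),  # identical strands add no variance
]
mgbv, sdgbv = gamete_mean_and_sd(toy_animal, n_gametes=10_000)
print(f"MGBV = {mgbv:.3f}, SDGBV = {sdgbv:.3f}")
```

In this sketch, a chromosome whose two strands carry identical segment values contributes to MGBV but not to SDGBV, which mirrors the dependence of SDGBV on heterozygosity.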



Validation 

Results of the simulation were validated by reconstruct- 
ing the paternally transmitted haplotype for each animal. 
Then the paternally transmitted haplotype breeding
value was estimated by summing the alleles of the paternally
transmitted haplotype (which in this case refers to haploid
chromosomes), weighted by half the estimated SNP effects. A
sensitivity analysis was performed to determine the size 
of the progeny groups per sire needed for validation. 
The observed mean and standard deviation of the esti- 
mated breeding values of the offspring were compared 
with the mean and standard deviation obtained from the 
simulation and correlations were computed. 
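
The comparison described above can be sketched in a few lines. This is a minimal illustration, assuming that each offspring record consists of a sire identifier and the breeding value of the paternally transmitted haplotype; the function names, the record layout and the use of the sample standard deviation are assumptions made here, not details from the study.

```python
import math
import statistics
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def validate(offspring_records, simulated_sdgbv, min_offspring=150):
    """Correlate observed within-sire SD with simulated SDGBV.

    offspring_records: iterable of (sire_id, paternal haplotype breeding value)
    simulated_sdgbv:   dict mapping sire_id to its simulated SDGBV
    Returns (number of sires used, correlation); needs at least two sires
    with differing values.
    """
    by_sire = defaultdict(list)
    for sire, value in offspring_records:
        by_sire[sire].append(value)
    # Keep only sires with more than min_offspring offspring
    sires = [s for s, values in by_sire.items()
             if len(values) > min_offspring and s in simulated_sdgbv]
    observed = [statistics.stdev(by_sire[s]) for s in sires]
    expected = [simulated_sdgbv[s] for s in sires]
    return len(sires), pearson(observed, expected)
```

Raising `min_offspring` trades the number of sires available against the precision of each observed standard deviation, which is the compromise examined in the sensitivity analysis.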

Mating plan 

Subsequent to the prediction of MGBV and SDGBV, 
specific matings were designed using newly developed 
mating software, which also includes animal ownership 
information and pedigree data. The expected mean 
breeding value of a potential offspring was calculated as: 

$$\mathrm{mBV} = \mathrm{MGBV}_s + \mathrm{MGBV}_d,$$

where mBV is the expected breeding value of an offspring
based on the parental average estimated breeding
values, $\mathrm{MGBV}_s$ is the estimated mean gamete breeding
value of the sire, and $\mathrm{MGBV}_d$ is the estimated mean
gamete breeding value of the dam.

Standard deviation of breeding values of the progeny, 
assuming no covariance between sire and dam, was cal- 
culated as: 

$$\mathrm{sBV} = \sqrt{\mathrm{SDGBV}_s^2 + \mathrm{SDGBV}_d^2},$$

where sBV is the expected standard deviation of breeding
values within the potential offspring of the same mating,
$\mathrm{SDGBV}_s$ is the standard deviation of gamete breeding
values of the sire, and $\mathrm{SDGBV}_d$ is the standard deviation
of gamete breeding values of the dam. In addition, the
probability of obtaining offspring with a breeding value over
a given threshold was calculated assuming normally distributed
breeding values, and the number of matings needed to
produce at least one offspring with an estimated breeding
value over a given threshold was calculated using a binomial
distribution.
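
The calculations of this section can be sketched as below. The expected mean and standard deviation follow the two formulas above, and the threshold probability uses a normal distribution. For the number of matings, the sketch solves 1 - (1 - p)^n >= confidence for n; the confidence level is not given in the text, so the default of 0.999 is purely an assumption, as are all function names.

```python
import math
from statistics import NormalDist


def offspring_distribution(mgbv_sire, sdgbv_sire, mgbv_dam, sdgbv_dam):
    """Return (mBV, sBV), assuming no covariance between sire and dam."""
    mbv = mgbv_sire + mgbv_dam
    sbv = math.sqrt(sdgbv_sire ** 2 + sdgbv_dam ** 2)
    return mbv, sbv


def prob_above(threshold, mbv, sbv):
    """Probability that an offspring breeding value exceeds the threshold."""
    return 1.0 - NormalDist(mu=mbv, sigma=sbv).cdf(threshold)


def matings_needed(p, confidence=0.999):
    """Smallest n such that at least one success occurs with the given
    probability, i.e. 1 - (1 - p)**n >= confidence (assumed criterion)."""
    if p <= 0.0:
        return math.inf
    if p >= 1.0:
        return 1
    return max(1, math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p)))


# Bull 1 mated to the average dam, using the rounded values of Table 4
mbv, sbv = offspring_distribution(1.81, 0.29, 0.55, 0.39)
for threshold in (2.0, 3.0):
    p = prob_above(threshold, mbv, sbv)
    print(threshold, round(p, 3), matings_needed(p))
```

Because the inputs are rounded table values and the confidence level is assumed, the printed numbers should only be expected to be close to, not identical with, those reported in Table 4.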

Results 

Mean and standard deviation of gamete breeding values 

Figure 2 shows the relationship between MGBV and SDGBV
for each trait and animal. Average MGBV were equal
to 0.36 genetic standard deviations (σ_a) for fat yield, 0.54
σ_a for protein yield, 0.22 σ_a for somatic cell score, and
0.09 σ_a for the direct genetic effect for stillbirth. A mean
SDGBV of 0.47 σ_a was obtained for somatic cell score.
The direct genetic effect for stillbirth had an average
SDGBV of 0.25 σ_a. All plots show the presence of animals
with equal MGBV but substantially different SDGBV. For
example, for protein yield, bulls with an MGBV of 1.8 σ_a
showed a maximum difference in SDGBV of 0.22 σ_a.

Figure 2 Relationship between MGBV and SDGBV. Traits investigated were fat yield, protein yield, somatic cell score and the direct genetic
effect for stillbirth. The red lines indicate means for MGBV and SDGBV. Each dot represents an animal.

Table 1 contains the observed correlations between
the MGBV for the four traits and with the genomic (F_G)
and the pedigree (F_P) inbreeding coefficients. The correlation
between MGBV was 0.66 for fat yield with protein yield
and 0.15 for somatic cell score with the direct genetic
effect for stillbirth. Correlations of MGBV were lower with
F_G than with F_P.

Correlations among SDGBV for the four traits are in
Table 2. These correlations were lower than correlations
among MGBV. Correlation between SDGBV was highest
for fat yield with protein yield (0.41). Correlations between
SDGBV for the other traits ranged from 0.05 to 0.13.
For all traits, correlations between SDGBV and F_P were
negative. Correlations between SDGBV and F_G were also
negative for all traits and two to four times larger than
correlations between SDGBV and F_P.

Table 1 Correlations between MGBV among traits and
with inbreeding coefficients

Item       MGBV_FY   MGBV_SCS   MGBV_SBd   F_G     F_P
MGBV_PY    0.66      0.13       0.06       0.01    0.14
MGBV_FY              0.15       0.06       0.02    0.13
MGBV_SCS                        0.15       0.05    0.11
MGBV_SBd                                   -0.02   0.05
F_G                                                0.52

MGBV_PY: mean gamete breeding value for protein yield; MGBV_FY: mean
gamete breeding value for fat yield; MGBV_SCS: mean gamete breeding value
for somatic cell score; MGBV_SBd: mean gamete breeding value for the direct
genetic effect for stillbirth; F_G: genomic inbreeding coefficient; F_P: pedigree
inbreeding coefficient.

The MGBV showed no difference between theoretical
and sampled quantiles of the normal distribution
function for any of the studied traits (results not
shown). Figure 3 shows Q-Q plots for SDGBV for the
four traits. The graphs indicate that the classes in the
middle of the distribution were almost normally distributed
for all traits. For the more extreme classes,
especially for animals with a SDGBV for fat yield lower
than 0.35 σ_a, a substantial deviation from the normal
distribution was observed.

Changes in SDGBV over time are in Figure 4. Similar 
to Figure 2, the SDGBV was highest for somatic cell 
score. The SDGBV for the direct genetic effect for still- 
birth was only half of the SDGBV for somatic cell score. 
All traits indicated a slightly negative trend of SDGBV 
over the last decades. Regression of SDGBV on birth year 
indicated that the decline in SDGBV was greatest for somatic
cell score (-0.0012 σ_a per year), followed by fat yield
(-0.00087 σ_a per year).

Validation of simulated SDGBV 

Table 3 shows a sensitivity analysis to determine the size
of the progeny groups needed for validation. Sires with
more than 150 offspring are a good compromise between
size of the group of offspring and number of sires
available. In this case, correlations between the observed
real progeny variation and the simulated SDGBV were
highest for fat yield (r = 0.93), followed by protein yield
and somatic cell score (r = 0.90), while the direct genetic
effect for stillbirth had the lowest correlation (r = 0.78).

Segelke et al. Genetics Selection Evolution 2014, 46:42
http://www.gsejournal.Org/content/46/1/42

Table 2 Correlation between SDGBV among traits and
with inbreeding coefficients

Item        SDGBV_FY   SDGBV_SCS   SDGBV_SBd   F_G     F_P
SDGBV_PY    0.41       0.09        0.11        -0.19   -0.09
SDGBV_FY               0.06        0.05        -0.10   -0.06
SDGBV_SCS                          0.13        -0.22   -0.08
SDGBV_SBd                                      -0.23   -0.05
F_G                                                    0.52

SDGBV_PY: Standard deviation of gamete breeding values for protein yield;
SDGBV_FY: Standard deviation of gamete breeding values for fat yield;
SDGBV_SCS: Standard deviation of gamete breeding values for somatic cell
score; SDGBV_SBd: Standard deviation of gamete breeding values for the direct
genetic effect for stillbirth; F_G: genomic inbreeding coefficient; F_P: pedigree
inbreeding coefficient.

Mating schemes 

Table 4 and Figure 5 show results from the mating of two
bulls that have extremely different SDGBV for protein
yield, with a poor, average and superior female from the
population. In addition, Table 4 contains the probabilities
of producing an offspring with a breeding value exceeding
0, 1, 2, 3 and 4 σ_a and the number of animals to be tested
to produce at least one animal with a breeding value
exceeding a fixed threshold. Resulting distributions of the
potential offspring were quite different between the two
bulls. Mating of bull 1 with an average cow of the population
is expected to produce animals with the highest mBV,
i.e. 2.36 σ_a. The same mating of bull 2 will generate
animals with a slightly lower expected mBV, i.e. 2.23 σ_a.
However, a bull that has the highest mean does not guarantee
the highest probability of producing offspring with a
breeding value greater than 3 or 4 σ_a. In this case, bull 2
had the highest probability of producing such offspring,
but its probability of having progeny with an extreme
negative breeding value was also greater. Similarly, the
number of animals to be tested to find at least one animal
with a breeding value higher than 2 σ_a was highest for bull 2.
To produce extreme animals with a breeding value
higher than 3 or 4 σ_a, more progeny had to be tested for
bull 1 than for bull 2. Choosing a poor or a superior dam
instead of an average cow changed the mean breeding
value of the potential offspring, but did not substantially
change the likelihood of obtaining offspring with extremely
low or high breeding values.

Discussion 

The objective of this study was to predict the expected 
genetic standard deviation within groups of offspring 
using real data. The results indicate that gamete breed- 
ing values vary between animals and these results can be 
used to make specific mating decisions. 

Gamete variation 

MGBV and SDGBV for direct genetic effect for stillbirth
were about half as high as for the three other traits
(Figure 2 and Figure 4), which is related to differences in
the reliabilities of the direct genomic breeding values
(DGV) between these traits. The reliability of DGV for fat
and protein yields is equal to 69% and for somatic cell
score to 74%, but only 44% for the direct genetic effect for
stillbirth [3]. Accordingly, the SNP effects for the direct
genetic effect for stillbirth are more regressed to the mean
than for the other traits.

Figure 4 Changes in SDGBV for fat yield, protein yield, somatic cell score and the direct genetic effect for stillbirth for animals born
between 1990 and 2012.

In comparison to the SNP-effect reference population,
high MGBV for protein and fat yields can be explained
by higher selection intensities and genetic gains than for
somatic cell score and the direct genetic effect for stillbirth.
Comparing the three different traits with similar
reliabilities indicates that protein yield had the highest
MGBV but the lowest SDGBV. This is explained by a
higher selection intensity for protein yield, which is
caused by a higher weight on this trait in the German
Total Merit Index [17]. However, up to now most
genotyped animals are elite animals, which means that
the genotyped animals are highly preselected. From this
point of view, the high MGBV for protein and fat yields
may not represent the mean breeding value of the
German Holstein population. In contrast, MGBV for
somatic cell score and for the direct genetic effect for
stillbirth are closer to the mean value of the population
since these traits are not as relevant for selection.
Similarly, Cole and Null [10] pointed out that
most genotyped animals are elite animals, which have
more chromosomes with a desirable DGV than chromosomes
with an undesirable DGV.

Table 3 Correlations (r) between SDGBV with real
progeny variations for different traits per minimum
number of offspring per sire

Minimum number of      Number of
offspring per sire     sires       r_FY   r_PY   r_SCS   r_SBd
10                     409         0.65   0.56   0.60    0.50
50                     146         0.90   0.78   0.80    0.72
100                    84          0.93   0.83   0.88    0.69
150                    48          0.93   0.90   0.90    0.78
200                    32          0.93   0.91   0.87    0.85
300                    20          0.96   0.93   0.94    0.82
500                    7           0.98   0.88   0.90    0.90

PY = protein yield; FY = fat yield; SCS = somatic cell score; SBd = the direct
genetic effect for stillbirth.

Negative correlations between F_G and SDGBV (Table 2)
are in agreement with [11]. These authors reported a stronger
correlation of the Mendelian sampling variance (similar
to the square of SDGBV) with F_G than with F_P, which
is caused by pedigree errors.

For animals with a low SDGBV for fat yield,
the Q-Q plot (Figure 3) showed a high divergence between
the theoretical normal distribution and the sampled
distribution. Cole and Null [10] indicated that mutations
with large effects like DGAT1 [18] should explain a higher
proportion of the genetic variance than the expected variance
based on the relative length of the chromosome. To
check if the DGAT1 locus has an effect on the distribution
of SDGBV, two scenarios were analyzed (Figure 6). In the
first scenario, the SDGBV for fat yield was predicted
including all 43 586 SNPs. Results showed a bimodal
distribution with SDGBV ranging from 0.25 to 0.6 σ_a.
In the second scenario, haplotypes in a region of 2.2 Mbp
surrounding the DGAT1 locus were excluded from the
SDGBV prediction. Under this scenario, SDGBV showed a
normal distribution with a lower mean and lower range
than for scenario 1. This indicates that the SDGBV
for a specific trait depends on its genetic architecture.
The larger the effect of a mutation on the trait and the
closer its allele frequency is to 0.5, the greater is its
influence on the SDGBV, which results in a deviation from
the normal distribution. Thaller et al. [19] reported an
allele frequency of 0.55 for Holstein animals for the
lysine-encoding variant (K232A) of the DGAT1 gene.
Furthermore, for the direct genetic effect for stillbirth,
several investigations [20,21] have indicated the presence
of a quantitative trait locus (QTL) on chromosome 18
with a high influence on calving traits. Haplotype analyses
demonstrated that a haplotype of 19 SNPs explains 16%
of the estimated breeding value variance for the direct
genetic effect for stillbirth (results not shown here).
However, the influence of this QTL on SDGBV for direct
genetic effect for stillbirth was less than the effect of DGAT1
on the SDGBV for fat yield. Differences in allele frequencies
of the DGAT1 gene and of the QTL for the direct
genetic effect for stillbirth might explain these findings.

Table 4 Results of mating two sires to a poor, average and superior female in the population for protein yield

Sire (σ_a)      Dam (σ_a)       Offspring (σ_a)   P (%)                                N
MGBV   SDGBV    MGBV   SDGBV    mBV    sBV        0σ_a   1σ_a   2σ_a   3σ_a   4σ_a     0σ_a   1σ_a   2σ_a   3σ_a   4σ_a
1.81   0.29     -1.40  0.44     0.41   0.53       78.0   13.3   0.1    0      0        5      48     6904   -      -
1.68   0.52     -1.40  0.44     0.28   0.68       66.0   14.5   0.6    0      0        6      44     1147   -      -
1.81   0.29     0.55   0.39     2.36   0.49       100    99.7   76.9   9.6    0        1      1      5      68     -
1.68   0.52     0.55   0.39     2.23   0.65       100    97.1   63.8   11.8   0.3      1      2      7      55     2299
1.81   0.29     2.12   0.32     3.93   0.43       100    100    100    98.5   43.5     1      1      1      1      12
1.68   0.52     2.12   0.32     3.80   0.61       100    100    99.8   90.5   37.2     1      1      1      3      15

The table shows the mating of two sires to three cows and the resulting mean and standard deviation of the potential offspring. In addition, the table shows the
probability (P) and minimum number of animals (N) to test, to generate at least one offspring over 0, 1, 2, 3, or 4 genetic standard deviations (σ_a) for protein yield.

Figure 5 Distribution of the breeding values of offspring for
protein yield. Two bulls (with MGBV equal to 1.81 σ_a and 1.68
σ_a and SDGBV equal to 0.29 σ_a and 0.52 σ_a, respectively) are
mated with an average female of the population (MGBV equal
to 0.55 σ_a; SDGBV equal to 0.39 σ_a).

Validation of simulated gamete variation 

Simulated SDGBV can only be validated for sires that 
have large groups of offspring. A validation independent 
from genomic information is only possible by comparing 
the SDGBV of a bull with the standard deviation of the 
phenotype-based estimated breeding values of its sons. 
However, only some very popular sires have a large num- 
ber of offspring with phenotype-based estimated breeding 
values. Using genomic information, many animals can be 
tested at a relatively low cost compared to the costs of 
progeny-testing of bulls, which makes it possible to inves- 
tigate the standard deviation of genomic breeding values 
within groups of offspring. Another approach to investi- 
gate and validate the standard deviation within groups of 
offspring is to use daughter yield deviations corrected for 
the contribution of the dam. One benefit of this approach 
is that many sires have very large groups of female off- 
spring because of artificial insemination. Figure 7 shows 
the trend over time of the mean haplotype breeding values 
that progeny inherit from their sire and dam. Results show 
a near linear trend for fat and protein yields, but the pater- 
nal haplotype had a higher intercept and steeper slope 
than the maternal haplotype. An interesting point is the
decrease in paternal MGBV for birth year 2003. Analysis
of the 2002, 2003 and 2004 tested birth cohorts (650 bulls
per year) also indicates a decrease in mean breeding values
for fat yield (0.33 σ_a, 0.25 σ_a, 0.43 σ_a) and protein yield
(0.55 σ_a, 0.46 σ_a, 0.71 σ_a) for the 2003 birth cohort. This
decrease is mainly caused by the offspring of three sires
which predominated in this birth year. On average, these
groups had breeding values for fat and protein yields that
were more than one σ_a lower than the predominating
groups of offspring in the birth cohorts in 2002 and 2004.
In contrast to the gamete breeding values for fat and protein
yields, no clear difference in gamete breeding values
between maternal and paternal haplotypes was found for
somatic cell score until the 2010 birth year. From birth
year 2010 to 2013, the paternal haplotype was superior to
the maternal haplotype. One explanation is that more and 
more genomically selected sires were used to produce ani- 
mals born between 2010 and 2013. In contrast, due to 
genotyping costs, many dams were not genomically se- 
lected, which results in lower genetic gain on the female 
side. For gamete breeding values for the direct genetic ef- 
fect for stillbirth, there was no genetic trend for either ma- 
ternal or paternal haplotype breeding values because the 
direct genetic effect for stillbirth does not seem to be a 
trait under intense selection. However, Figure 7 shows that 
for fat and protein yields there is a difference between 
sires and dams, which has to be taken into account in the 
validation. The gap between estimated sire and dam 
haplotype breeding values can be reduced by increasing 
genotyping and selection intensity in the dams-to-bulls 
and dams-to-cows selection paths. 



Systematic genotyping of young Holstein Friesian can- 
didates started in 2010. This implies that animals born 
before 2010 were selectively genotyped because of their 
importance for the breeding scheme and their contribu- 
tion to the reference population. The within-family vari- 
ance of older families could be affected by this selective 
genotyping. Genotyping more animals results in larger 
groups of offspring from randomly genotyped sires, 
which should result in improved future validations. 

VanRaden et al. [8] and Fritz et al. [22] reported that
some haplotypes are never present in the homozygous 
state, because embryos that are homozygous for these 
haplotypes are not viable. This fact and genetic defects 
like Brachyspina [23,24], Bovine Leukocyte Adhesion 
Deficiency (BLAD [25]) or Complex Vertebral Malforma- 
tion (CVM [26]) also influence the SDGBV. However, the 
effect on the variation depends on the allele frequency in 
the population; thus a loss of variation can be observed
only when sperm and ovum carry the same genetic defect. 
This fact can explain the difference between simulated 
and observed realized gamete breeding values, because 
the simulation did not consider loss of variation due to 
genetic defects. Indeed, gamete breeding values rather 
than animal breeding values were simulated and a carrier 
of a genetic defect had no influence on SDGBV if the mat- 
ing partner did not carry this defect. 

Mating designs 

Figure 2 shows that there are animals with a high mean
and a low variability, which are relevant for dairy farmers.
Conversely, animals with a high mean and a high standard
deviation are interesting for AI companies because selecting
these animals will increase the probability of producing
animals with extremely positive breeding values in
the future.

Haplotype information enables the estimation of selection
limits. Summing up the best breeding value for each
haplotype will give the theoretically best animal. The gamete
breeding values of these hypothetical animals should
reach +30 σ_a (707 kg) for fat yield, +32 σ_a (539 kg) for protein
yield, +35 σ_a for somatic cell score and +14.2 σ_a for the
direct genetic effect for stillbirth. Cole and VanRaden [11] showed
that the selection limit for protein yield was 1138 kg. Although
our results are estimated at the haplotype level
and those of [11] at the animal level, they are consistent.
Theoretical mating of the two best animals for protein
yield in our dataset would produce animals with a mean
estimated breeding value of 4.82 σ_a and a standard deviation
of 0.76 σ_a. The probability of producing an offspring
with a breeding value higher than 8 σ_a is 0.14%, which is
only one third of the selection limit, which illustrates that
animals from the current population are far from the
selection limits.
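
The selection-limit idea described above, taking the best haplotype breeding value at each chromosome segment and summing over segments, can be sketched as follows. The function name, the data layout (one list of segment values per haplotype) and the toy numbers are hypothetical; the `higher_is_better` switch is an assumption added here for traits where a low value is desirable.

```python
def selection_limit(haplotype_values, higher_is_better=True):
    """Sum the best segment value over all haplotypes in the population.

    haplotype_values: one list per haplotype, holding the haplotype
    breeding value of each chromosome segment in the same order.
    """
    pick = max if higher_is_better else min
    # zip(*...) groups the values of all haplotypes segment by segment
    return sum(pick(segment) for segment in zip(*haplotype_values))


# Hypothetical toy population: three haplotypes, three segments
toy_haplotypes = [
    [0.1, -0.2, 0.3],
    [0.0, 0.4, -0.1],
    [-0.3, 0.1, 0.2],
]
print(selection_limit(toy_haplotypes))
```

The result is the value of the best possible gamete in this toy population; an animal carrying two copies of it would have twice that value.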

Figure 5 and Table 4 show that two different mating 
strategies can be designed based on knowledge about 
MGBV and SDGBV. On the one hand, AI companies are 
interested in finding extremely positive offspring and, 
from this point of view, mating bull 2 would be the best 
choice. On the other hand, farmers are more interested in 
homogeneous groups of offspring with low SDGBV, which 
means that mating bull 1 would be better for breeding in 
these herds. For computational reasons, no covariance
between sire and dam was assumed to calculate the sBV.
Thus, this method has to be improved because the
German Holstein population has a small effective population
size, which increases the level of relationships and
results in a non-zero covariance between sires and dams.

Finding the best combination of mating partners in mat- 
ing programs that are based on genomic information re- 
quires time- and memory-intensive computing because of 
the large amount of data. A great benefit of the method
described in this study is that MGBV and SDGBV need to
be computed only once for each animal. After this step, it
is computationally easy to find mating partners because
mBV is the sum of the paternal and maternal MGBV and
sBV is the square root of the sum of the squared paternal
and maternal SDGBV. Calculating the probability that
an animal reaches a defined threshold is simple using normal
distribution functions. Based on this methodology, a
software tool for breeding associations was developed, 
which includes MGBV and SDGBV for a portfolio of bulls 
of interest and for genotyped cows. Given this informa- 
tion, the association can specify which breeding value 
threshold the offspring of a given cow should exceed and 
the tool provides a list of bulls that are expected to reach 
this criterion. 
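
The kind of query described above can be sketched as a simple ranking function. This is not the software tool itself, only a minimal illustration: the function name, the input layout (MGBV and SDGBV pairs in σ_a) and the `min_probability` cut-off are hypothetical, and zero covariance between sire and dam is assumed as in the rest of the study.

```python
import math
from statistics import NormalDist


def rank_bulls(cow, bulls, threshold, min_probability=0.5):
    """List bulls whose offspring with this cow exceed the threshold
    with at least min_probability, best bull first.

    cow:   (MGBV, SDGBV) of the genotyped cow
    bulls: dict mapping bull name to (MGBV, SDGBV)
    """
    cow_mgbv, cow_sdgbv = cow
    ranked = []
    for name, (mgbv, sdgbv) in bulls.items():
        mbv = cow_mgbv + mgbv
        sbv = math.hypot(cow_sdgbv, sdgbv)  # sqrt of summed squares
        p = 1.0 - NormalDist(mu=mbv, sigma=sbv).cdf(threshold)
        if p >= min_probability:
            ranked.append((name, p))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Average cow and the two bulls of Table 4 (rounded values)
average_cow = (0.55, 0.39)
portfolio = {"bull 1": (1.81, 0.29), "bull 2": (1.68, 0.52)}
print(rank_bulls(average_cow, portfolio, threshold=2.0))
print(rank_bulls(average_cow, portfolio, threshold=3.0, min_probability=0.0))
```

With these inputs the ranking of the two bulls reverses between the thresholds of 2 σ_a and 3 σ_a, in line with the two mating strategies discussed for Table 4.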

Future aspects and applications 

Decreasing genotyping costs makes it possible to geno- 
type whole commercial herds [27]. Considering MGBV and 
SDGBV derived from haplotypes and SNP effect estimates 
is only one example of the use of additional genomic infor- 
mation in genomic mating programs. Ongoing research will 
develop new tools such as the estimation of dominance 
effects [28] or more information about haplotypes with 
specific genomic effects. Software solutions need efficient 
and highly performing programs, which can handle large 
amounts of data within a reasonable timeframe. 

Conclusions 

The expected SDGBV of a potential parent can be estimated
from genomic information. The SDGBV differs
between animals and tends to be normally distributed in
the absence of QTL with a large effect on the trait. For
SDGBV for fat yield, a deviation from a normal distribution
that is caused by the DGAT1 mutation results in a
higher SDGBV than expected. Furthermore, for all traits,
SDGBV decreased slightly in recent years because of an
increase in the level of inbreeding. A genomic mating program
was developed to find optimal mating partners
proach also allows the probability of finding an offspring 
with a breeding value exceeding a chosen threshold to 
be calculated. 